Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution
Convolutional neural networks have recently demonstrated high-quality
reconstruction for single-image super-resolution. In this paper, we propose the
Laplacian Pyramid Super-Resolution Network (LapSRN) to progressively
reconstruct the sub-band residuals of high-resolution images. At each pyramid
level, our model takes coarse-resolution feature maps as input, predicts the
high-frequency residuals, and uses transposed convolutions for upsampling to
the finer level. Our method does not require bicubic interpolation as a
pre-processing step and thus dramatically reduces the computational complexity.
We train the proposed LapSRN with deep supervision using a robust Charbonnier
loss function and achieve high-quality reconstruction. Furthermore, our network
generates multi-scale predictions in one feed-forward pass through the
progressive reconstruction, thereby facilitating resource-aware applications.
Extensive quantitative and qualitative evaluations on benchmark datasets show
that the proposed algorithm performs favorably against the state-of-the-art
methods in terms of speed and accuracy.
Comment: This work is accepted to CVPR 2017. The code and datasets are
available at http://vllab.ucmerced.edu/wlai24/LapSRN
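The robust Charbonnier loss used to train LapSRN is a differentiable approximation to the L1 loss. A minimal NumPy sketch (the eps value below is illustrative, not taken from the paper):

```python
import numpy as np

def charbonnier_loss(pred, target, eps=1e-3):
    """Charbonnier loss: sqrt((x - y)^2 + eps^2), averaged over pixels.

    Behaves like L1 for large residuals (robust to outliers) while
    remaining smooth near zero, unlike plain |x - y|.
    """
    diff = pred - target
    return np.mean(np.sqrt(diff * diff + eps * eps))
```

For identical inputs the loss approaches eps rather than zero, which keeps the gradient well-defined everywhere.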
Learning Spatial and Temporal Visual Enhancement
Visual enhancement is concerned with improving the visual quality and viewing experience of images and videos. Researchers have been actively working in this area due to its theoretical and practical interest. However, high visual quality often comes at the cost of computational efficiency. With the growth of mobile applications and cloud services, it is crucial to develop effective and efficient algorithms for generating visually attractive images and videos. In this thesis, we address visual enhancement problems in three aspects: the spatial, temporal, and joint spatial-temporal domains. We propose efficient algorithms based on deep convolutional neural networks for solving various visual enhancement problems.

First, we address the problem of spatial enhancement for single-image super-resolution. We propose a deep Laplacian Pyramid Network to reconstruct a high-resolution image from a low-resolution input in a coarse-to-fine manner. Our model directly extracts features from input LR images and progressively reconstructs the sub-band residuals. We train the proposed model with multi-scale training, deep supervision, and robust loss functions to achieve state-of-the-art performance. Furthermore, we exploit recursive learning to share parameters across and within pyramid levels, significantly reducing the number of model parameters. As most of the operations are performed in a low-resolution space, our model requires less memory and runs faster than state-of-the-art methods.

Second, we address the temporal enhancement problem by learning temporal consistency in videos. Given an input video and a per-frame processed video (processed by an existing image-based algorithm), we learn a recurrent network to reduce temporal flickering and generate a temporally consistent video.
We train the proposed network by minimizing short-term and long-term temporal losses as well as a perceptual loss to strike a balance between temporal coherence and perceptual similarity to the processed frames. At test time, our model does not require computing optical flow and thus runs at 400+ FPS on a GPU for high-resolution videos. Our model is task-independent: a single model can handle multiple and unseen tasks, including but not limited to artistic style transfer, enhancement, colorization, image-to-image translation, and intrinsic image decomposition.

Third, we address the spatial-temporal enhancement problem for video stitching. Inspired by pushbroom cameras, we cast stitching as a spatial interpolation problem. We propose a pushbroom stitching network that learns dense flow fields to smoothly align the input videos; the stitched video is generated by an efficient pushbroom interpolation layer. Our approach generates more temporally stable and visually pleasing results than existing video stitching approaches and commercial software. Furthermore, our algorithm has immediate applications in areas such as virtual reality, immersive telepresence, autonomous driving, and video surveillance.
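A short-term temporal loss of the kind described above typically compares each output frame against the previous output frame warped by optical flow, masked to visible pixels. The sketch below is a generic, hypothetical formulation (the warped frame and visibility mask are assumed to be computed elsewhere; it is not the thesis's exact loss):

```python
import numpy as np

def short_term_temporal_loss(frame_t, warped_prev, visibility_mask):
    """Hypothetical short-term temporal loss.

    Penalizes per-pixel differences between the current output frame
    and the flow-warped previous output frame, but only where the
    visibility mask marks pixels as non-occluded. Normalized by the
    number of visible pixels so occlusions do not shrink the loss.
    """
    diff = np.abs(frame_t - warped_prev) * visibility_mask
    return diff.sum() / max(visibility_mask.sum(), 1)
```

A long-term variant applies the same comparison against a distant reference frame (e.g. the first frame) to suppress slow drift; optical flow is needed only during training, which is why inference can skip it entirely.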
Identifying a Transcription Factor’s Regulatory Targets from its Binding Targets
ChIP-chip data, which show binding of transcription factors (TFs) to promoter regions in vivo, are widely used by biologists to identify the regulatory targets of TFs. However, the binding of a TF to a gene does not necessarily imply regulation. Thus, it is important to develop computational methods that can extract a TF’s regulatory targets from its binding targets. We developed a method, called the REgulatory Targets Extraction Algorithm (RETEA), which uses partial correlation analysis on gene expression data to extract a TF’s regulatory targets from its binding targets inferred from ChIP-chip data. We applied RETEA to yeast cell cycle microarray data and identified the plausible regulatory targets of eleven known cell cycle TFs. We validated our predictions by checking the enrichment for cell cycle-regulated genes, common cellular processes, and common molecular functions. Finally, we showed that RETEA performs better than three published methods (MA-Network, TRIA, and Garten et al.’s method).
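RETEA's core tool is partial correlation: the correlation between two expression profiles after the linear effect of a control variable is removed. A generic residual-based computation (not the authors' implementation) can be sketched as:

```python
import numpy as np

def partial_corr(x, y, z):
    """Partial correlation of x and y controlling for z.

    Regresses x and y on z (with an intercept) by least squares,
    then correlates the residuals. If x and y remain correlated
    after z's contribution is removed, the association is not
    explained by z alone.
    """
    design = np.column_stack([z, np.ones_like(z)])  # add intercept column
    rx = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    ry = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return np.corrcoef(rx, ry)[0, 1]
```

In a RETEA-like setting, z would play the role of the TF's expression profile: a binding target whose expression correlates with another target's only through the TF (near-zero partial correlation) is evidence the TF itself drives the relationship.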
Automatic Composition Recommendations for Portrait Photography
A user with no training in photography who takes pictures with a smartphone or other camera is often unable to capture attractive portrait photographs. This disclosure describes techniques to automatically determine optimal camera view-angles and frame elements, and to generate instructions that guide users to capture better-composed photographs. An ultra-wide (UW) image is obtained via a stream parallel to the wide (W) image stream that the user previews during capture. The UW image is used as a guide to determine an optimal field of view (FoV) for the W image, e.g., to determine an optimal foreground and background composition; to add elements that enhance artistic value; to omit elements that detract from artistic value; etc. Standard techniques of good photography, e.g., the rule of thirds, optimal head orientation, etc., can be used to guide the user to an optimal FoV that results in an attractive photograph.